Export Snapshot Profiling Stack Traces As OpenTelemetry Log Messages #2237
laurit merged 32 commits into signalfx:main
Conversation
…ted into the pprof data format as otel logs.
…io.opentelemetry.sdk.autoconfigure package.
…g is properly configured and that logs are exported to the OpenTelemetry log exporter.
…essage attributes.
…LogExportingTest.
annotationProcessor("com.google.auto.service:auto-service")
compileOnly("com.google.auto.service:auto-service")
testImplementation(project(":custom"))
I needed access to the AutoConfigureUtil class defined in this subproject. It is used in StackTraceExporterActivator to access the ConfigProperties and Resource fields of the AutoConfiguredOpenTelemetrySdk when configuring the Otel Logger.
The existing compileOnly dependency is not on the test classpath, so it did not allow me to verify that the configuration was taking place.
public static boolean isSnapshotProfilingEnabled(ConfigProperties properties) {
  return properties.getBoolean(CONFIG_KEY_ENABLE_SNAPSHOT_PROFILER, false);
}
A 2nd use was added in StackTraceExporterActivator so I thought it made sense to roll this into the Configuration class.
Adapted from code in JfrActivator. I intentionally didn't modify the existing profiling code to minimize the scope of this PR.
Very similar to the existing Pprof class, mostly QoL/DX changes. I decided to port the code as is, duplications and all, due to timeline constraints. Once the extension functionality is in place my intention is to figure out how to unify the two slightly different strategies used by the continuous profiler and the snapshot profiler.
Similar to PprofCpuEventExporter though slightly simpler due to guarantees that thread and trace information will be available.
Sets up the StackTraceExporter instance so that collected stack traces can be exported as Otel logs. If this step doesn't happen, traces may be profiled but the data will be dropped rather than exported.
This class acts as a middleman between AccumulatingStagingArea and the StackTraceExporter. It's necessary because the StagingArea is instantiated during the SDK configuration phase (see updates in SnapshotProfilingSdkCustomizer) but the Otel Logger can't be configured until later in the agent startup process (see the newly added StackTraceExporterActivator).
To insulate the AccumulatingStagingArea from this 2-step initialization requirement, it uses this Supplier to obtain a StackTraceExporter, receiving a no-op implementation whenever the initialization process has yet to complete.
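A minimal sketch of that Supplier-based indirection, using hypothetical simplified types (StackTraceExporter and ExporterHolder here are illustrative stand-ins, not the PR's actual classes):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Illustrative stand-in for the real exporter interface.
interface StackTraceExporter {
  StackTraceExporter NOOP = stackTraces -> {}; // safe default before activation
  void export(List<String> stackTraces);
}

// Holds the currently active exporter; swapped in once agent startup completes.
class ExporterHolder {
  private final AtomicReference<StackTraceExporter> active =
      new AtomicReference<>(StackTraceExporter.NOOP);

  // Called later (e.g. by an activator) once the Otel Logger is configured.
  void activate(StackTraceExporter real) {
    active.set(real);
  }

  // The staging area only ever sees this Supplier, so it is insulated from
  // the two-step initialization: before activation it receives the no-op.
  Supplier<StackTraceExporter> supplier() {
    return active::get;
  }
}
```

The staging area would call supplier().get().export(...) at flush time; anything flushed before activation goes to the no-op and is silently dropped, matching the behavior described above.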
Component test of the extension's ability to export logs. Selects a trace for snapshotting, starts and stops a root span, and verifies that a log record lands in the configured Otel LogRecordExporter. Relies on the agent-startup simulation added to OpenTelemetrySdkExtension.
The primary change in this class is the ability to attach AgentListener instances and simulate that portion of the agent startup process. Doing so requires an AutoConfiguredOpenTelemetrySdk instance, which is difficult to obtain on a per-test basis. The easiest approach was to subclass AutoConfiguredOpenTelemetrySdk, which unfortunately required moving the entire JUnit extension into this shadow package so two of the abstract methods could be implemented.
There is obviously a lot of overlap between code added in this PR and existing code for the continuous profiler. My initial aim was to unify the two approaches by adapting the new snapshot profiling code to use the existing constructs; however, as I started down that path it became clear there were enough nuances between the implementations that there wasn't a straightforward way to reconcile the differences. Given that, I decided the safest thing to do is port the snapshot profiling log exporting code more-or-less as is and attempt the unification as a follow-up.
profiler/src/main/java/com/splunk/opentelemetry/profiler/snapshot/AsyncStackTraceExporter.java (outdated, resolved)
 * which is itself adapted from Google's Bazel build system <a
 * href="https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/profiler/memory/AllocationTracker.java">here</a>.
 */
class Pprof {
To me this copy paste does not make sense. Have a look at https://github.com/signalfx/splunk-otel-java/pull/1306/files#diff-ca57673d0f9c0ab8a63ac33abeb7aa277d52fce7a74774eb6e2ace458c01c88e. Would it be complicated to use a similar approach? If it helps I guess we could consider merging that PR (needs approval from @breedx-splk) or you could copy from that PR. Note that the gdi spec dictates the values for missing stack trace items (line numbers, method, class and file names).
I tried to explain here and here why I went this way.
The longer version is that I was tinkering for a few hours on how to reconcile the differences between the StackTrace defined in the snapshot package with the StackTrace defined in StackTraceParser used in PprofCpuEventExporter. The two data classes are very similar with the primary differences being the osThreadId field and how the stack frames are represented (StackTraceLine vs the built-in StackTraceElement). The approach I was exploring was to define StackTrace and StackFrame interfaces with the relevant accessor methods.
The next obstacle was around the building and serialization of the profiling data. PprofCpuEventExporter maintains a mutable Pprof instance that it resets after being flushed, whereas the design of the snapshot profiling extension does not maintain any mutable state during the log export process. The location of the serialization code also differs: it lives in the Pprof class itself vs in the AsyncLogExporter for snapshots.
There's also a minor DX change in the snapshot profiling Pprof version, where label creation and the adding of labels to Samples are separated into two distinct operations and pushed up a level into the PprofTranslator, as well as the addition of a few convenience methods that accept a StackTraceElement. For example:
// Always On
...
if (stackTrace.getThreadId() != 0) {
pprof.addLabel(sample, THREAD_ID, stackTrace.getThreadId());
pprof.addLabel(sample, THREAD_NAME, stackTrace.getThreadName());
}
...
for (StackTraceParser.StackTraceLine stl : stackTrace.getStackTraceLines()) {
sample.addLocationId(
pprof.getLocationId(
stl.getLocation(), stl.getClassName(), stl.getMethod(), stl.getLineNumber()));
pprof.incFrameCount();
}
// Snapshot Profiling
...
sample.addLabel(pprof.newLabel(THREAD_ID, stackTrace.getThreadId()));
sample.addLabel(pprof.newLabel(THREAD_NAME, stackTrace.getThreadName()));
...
for (StackTraceElement stackFrame : stackTrace.getStackFrames()) {
sample.addLocationId(pprof.getLocationId(stackFrame));
pprof.incFrameCount();
}Finally, I also thought it was a better idea to tackle the unification of the two stack trace data models and their translation into the pprof data format separately so as to not unintentionally break the existing Always On profiling extension while the snapshot profiling extension is being ported.
Also see here for another place where I intentionally did not modify the existing profiling code to minimize the scope of this PR.
@laurit, thanks for doing this! I've merged that PR into the branch, added tests, and also added new configuration properties to allow control over both the stack depth and sampling period of the snapshot profiler.
@@ -93,10 +97,12 @@ public void run() {
Instant now = Instant.now();
try {
  ThreadInfo threadInfo = threadMXBean.getThreadInfo(threadId, MAX_ENTRY_DEPTH);
In the non-snapshot profiler there is a configuration option for this.
One thing you need to handle is that for stack traces that are truncated you need to set an extra attribute. For this to be possible you may need to set a slightly larger limit here and manually truncate later.
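A sketch of that suggestion (TruncationAwareSampler and the attribute-signaling mechanism are illustrative, not the PR's actual API): request one frame beyond the configured depth, then truncate manually so the caller knows when to set the truncation attribute.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

// Illustrative sketch: request maxDepth + 1 frames so a stack that is
// exactly maxDepth deep can be told apart from one that was cut off.
class TruncationAwareSampler {
  private final ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
  private final int maxDepth;

  TruncationAwareSampler(int maxDepth) {
    this.maxDepth = maxDepth;
  }

  // truncatedOut[0] signals whether the caller should set the
  // "stack was truncated" attribute on the exported record.
  StackTraceElement[] sample(long threadId, boolean[] truncatedOut) {
    ThreadInfo info = threadMXBean.getThreadInfo(threadId, maxDepth + 1);
    StackTraceElement[] frames = info.getStackTrace();
    boolean truncated = frames.length > maxDepth;
    truncatedOut[0] = truncated;
    return truncated ? Arrays.copyOf(frames, maxDepth) : frames;
  }
}
```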
...src/main/java/com/splunk/opentelemetry/profiler/snapshot/SnapshotProfilingSdkCustomizer.java (resolved)
...main/java/com/splunk/opentelemetry/profiler/snapshot/ScheduledExecutorStackTraceSampler.java (outdated, resolved)
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

class AccumulatingStagingArea implements StagingArea {
Here you collect stack traces per trace id and export them when the trace is done. To me the staging area abstraction doesn't seem ideal. When there is a long request you end up accumulating many stack traces; when the request is short you export a small batch with only one or two traces. When there are multiple concurrent short requests you'll export many small batches. With the non-snapshot profiler we already get the input data in batches and the interval between batches is large, so we don't need to handle this. Did you consider alternative solutions, for example what is used in https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/trace/src/main/java/io/opentelemetry/sdk/trace/export/BatchSpanProcessorBuilder.java? Place data in a queue; if the queue has batch-size items (there it is 512), export them; otherwise wait for a bit (there 5s) and export whatever is in the queue. If the queue grows too large, drop data.
This sounds like a very good idea. Would you be against doing this as a followup (this PR is a partial blocker to adding span correlation)?
I don't think a time-based approach is necessarily the right approach however. What I'm envisioning is a BufferingStagingArea with some number of max stack traces before it automatically empties itself in addition to the forced empty upon detection of the entry span ending. I don't know what that number should be; maybe 20?
This would control how many stack traces are bundled into a single log record. The frequency of the log record exporting is separately controlled by the LogRecordProcessor configured in OtelLoggerFactory. Right now it wires up a SimpleLogRecordProcessor, which I believe immediately exports each log record. We could use the BatchLogRecordProcessor instead, though I've had issues getting this to work in the past.
> Would you be against doing this as a followup
sure
> I don't think a time-based approach is necessarily the right approach however. What I'm envisioning is a BufferingStagingArea with some number of max stack traces before it automatically empties itself in addition to the forced empty upon detection of the entry span ending. I don't know what that number should be; maybe 20?
Flushing when there are no new stack traces in some time isn't that hard. When you write stacks from all ongoing traces to the same pprof then flushing on ending the entry span doesn't make sense. 20 is imo too low, 2000 or even larger would be better. I think that determining the batch size requires some experimentation.
> This would control how many stack traces are bundled into a single log record. The frequency of the log record exporting is separately controlled by the LogRecordProcessor configured in OtelLoggerFactory. Right now it wires up a SimpleLogRecordProcessor which I believe immediately exports the log record. We could use the BatchLogRecordProcessor instead, though I've had issues getting this to work in the past.
Using BatchLogRecordProcessor here isn't ideal because pprof works best when you put lots of stacks into the same batch. When the same string is used multiple times then the shared string pool will help in reducing the transfer size. We need to manage creating these batches ourself to not export too often so we really don't gain much from using the BatchLogRecordProcessor.
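A minimal sketch of the queue-and-batch shape discussed above (the sizes, names, and single-shot drain are illustrative; BatchSpanProcessor runs the equivalent loop on a dedicated worker thread):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Illustrative staging area: bounded queue, fixed batch size, drop on overflow.
class BatchingStagingArea {
  private final BlockingQueue<String> queue;
  private final int batchSize;
  private final Consumer<List<String>> exporter;

  BatchingStagingArea(int queueCapacity, int batchSize, Consumer<List<String>> exporter) {
    this.queue = new ArrayBlockingQueue<>(queueCapacity);
    this.batchSize = batchSize;
    this.exporter = exporter;
  }

  // Producers never block; when the queue is full the stack trace is dropped.
  boolean stage(String stackTrace) {
    return queue.offer(stackTrace);
  }

  // A worker thread would call this in a loop: wait up to `timeout` for
  // data, then export a batch of whatever has accumulated.
  void drainOnce(long timeout, TimeUnit unit) {
    List<String> batch = new ArrayList<>(batchSize);
    String first;
    try {
      first = queue.poll(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return;
    }
    if (first == null) {
      return; // nothing arrived within the timeout
    }
    batch.add(first);
    queue.drainTo(batch, batchSize - 1);
    exporter.accept(batch);
  }
}
```

Because the batch boundary is independent of any single trace's lifetime, many short concurrent requests land in one pprof batch, which is the property laurit points out above.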
Share pprof code
…s collected by the snapshot profiler.
…max stack trace depth for the snapshot profiler.
private static final java.util.logging.Logger logger =
    java.util.logging.Logger.getLogger(AsyncStackTraceExporter.class.getName());

private final ExecutorService executor = Executors.newSingleThreadScheduledExecutor();
I think you meant to use newSingleThreadExecutor here.
  }
  cpuEventExporter.flush();
} catch (Exception e) {
  logger.log(Level.SEVERE, "an exception was thrown", e);
Perhaps change to something like "An exception was thrown while exporting profiling snapshots."
stackTrace.getStackFrames(),
stackTrace.getTimestamp(),
stackTrace.getTraceId(),
null);
I'm not sure how the profiling UI will handle the missing span id here. Perhaps it would be better to use the local root span id here?
The next PR adding span correlation will fill this detail in. To your concern, a different set of UI screens is being created for snapshot profiling data, and the two sources of call stacks will not overlap at any point in the UI.
try {
  sampler.start(spanContext);
  await().atMost(HALF_SECOND).until(() -> !staging.allStackTraces().isEmpty());
It is best not to use short wait times like atMost(HALF_SECOND), as this can result in sporadic test failures.
I removed all of the explicit timeouts from these tests so the 10 second default is used.
…ot profiling sampling interval.
This PR plugs in the log exporting functionality of the snapshot profiler which completes the end-to-end extension data flow. The full extension sequence is now:
1. Stack traces accumulate in the AccumulatingStagingArea
2. The StagingArea is emptied to the StackTraceExporter
3. Stack traces are translated into the pprof data format
4. A LogRecordData is created using the pprof payload as the log body
5. The LogRecordData is emitted to the configured Otel LogRecordProcessor

The configuration of the Otel Logger required adding an additional AgentListener to obtain the Resource, which in turn necessitated a 2-step initialization and an additional abstraction layer within AccumulatingStagingArea.

The need to utilize the AgentListener callback also required further simulating of the agent startup process, which was done by updating OpenTelemetrySdkExtension to support adding AgentListeners and calling into them prior to a test running. This did require OpenTelemetrySdkExtension to extend AutoConfiguredOpenTelemetrySdk and move into a shadow package to actually implement the abstract methods, along with renaming the builder function to configure so as to not conflict with the builder function already defined.
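For the translate-and-emit steps of the sequence above, the continuous profiler carries serialized pprof payloads as gzipped, base64-encoded text in the log body; assuming the snapshot path does the same, the encoding step looks roughly like this (a plain byte[] stands in for the serialized Profile proto, and PprofLogBodyEncoder is an illustrative name):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

// Illustrative sketch: gzip the serialized pprof bytes, then base64-encode
// the result so it can be carried as the string body of a log record.
class PprofLogBodyEncoder {
  static String encode(byte[] profileBytes) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
        gzip.write(profileBytes);
      }
      return Base64.getEncoder().encodeToString(out.toByteArray());
    } catch (IOException e) {
      throw new UncheckedIOException(e); // in-memory streams should not fail
    }
  }
}
```

Gzipping before encoding is what makes large batches pay off: pprof's shared string table compresses well when many samples reuse the same frames.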